Abstract. In a 1976 paper published in Science, Knuth presented an algorithm to sample
(non-uniform) self-avoiding walks crossing a square of side $k$. From this sample, he constructed
an estimator for the number of such walks. The quality of this estimator is directly related
to the (relative) variance of a certain random variable $X_k$. From his experiments, Knuth
suspected that this variance was extremely large, so that the estimator would not be very
efficient.

A few years ago, Bassetti and Diaconis showed that, for a similar sampler that only generates
walks consisting of North and East steps, the relative variance is $O(\sqrt{k})$. In this note we
go one step further by showing that, for walks consisting of North, South and East steps, the
relative variance is of the order of $2^{k(k+1)}/(k+1)^{2k}$, and thus much larger than exponential
in $k$. We also obtain partial results for general self-avoiding walks, suggesting that the relative
variance could be as large as $\mu^{k^2}$ for some $\mu > 1$.

Knuth's algorithm is a basic example of a widely used technique called sequential importance
sampling. The present paper, following Bassetti and Diaconis' paper, is one of very few
examples where variances of the estimators can be found.



1. Introduction 

A self-avoiding walk (SAW) on a graph is a walk that never visits the same vertex twice.
Let $\mathcal W_k$ be the set of SAWs on a $k \times k$ square grid, going from the South-West vertex to the
North-East vertex (Figure 1). In his paper "Coping with finiteness" [11, 13], Knuth described the
following algorithm to generate a (non-uniform) random walk of $\mathcal W_k$: start from the South-West
corner, and at each time, choose with equal probability (which can be 1/3, 1/2 or 1) one of the
eligible steps. A step is eligible if, once appended to the current walk, it gives a self-avoiding
walk which can be extended so as to end at the North-East corner. In this way the walk is never
trapped and the algorithm always succeeds$^1$. Figure 1 shows the probabilities of the 12 possible
walks when $k = 2$. Two bigger examples ($k = 10$, $k = 100$) are shown in Figure 2. This procedure
is a basic example of a widely used technique called sequential importance sampling [5l[Sl[S|- 
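
The sampling rule described above can be checked exhaustively on very small squares. The following is a minimal sketch, not Knuth's implementation: it assumes that a step is eligible exactly when the North-East corner is still reachable from the new endpoint through unvisited vertices (tested here by a breadth-first search), and the names `can_reach`, `eligible_steps` and `walk_probabilities` are hypothetical helpers introduced for illustration. For $k = 2$ it recovers the 12 walks and the probabilities 1/8, 1/12 and 1/16 reported in Figure 1.

```python
from collections import Counter, deque
from fractions import Fraction

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # East, West, North, South


def can_reach(start, target, visited, k):
    """Breadth-first search: can `target` be reached from `start` through
    vertices of the (k+1) x (k+1) grid that the walk has not visited?"""
    seen, queue = {start}, deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == target:
            return True
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            inside = 0 <= nxt[0] <= k and 0 <= nxt[1] <= k
            if inside and nxt not in visited and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def eligible_steps(walk, k):
    """Endpoints of the steps that keep the walk self-avoiding and untrapped."""
    visited, (x, y) = set(walk), walk[-1]
    result = []
    for dx, dy in STEPS:
        nxt = (x + dx, y + dy)
        inside = 0 <= nxt[0] <= k and 0 <= nxt[1] <= k
        if inside and nxt not in visited and can_reach(nxt, (k, k), visited, k):
            result.append(nxt)
    return result


def walk_probabilities(k):
    """Return {walk: p(walk)} for every SAW crossing the k x k square."""
    probs = {}
    stack = [(((0, 0),), Fraction(1))]
    while stack:
        walk, p = stack.pop()
        if walk[-1] == (k, k):          # the walk stops at the North-East corner
            probs[walk] = p
            continue
        choices = eligible_steps(walk, k)
        for nxt in choices:             # each eligible step is equally likely
            stack.append((walk + (nxt,), p / len(choices)))
    return probs


probs = walk_probabilities(2)
histogram = Counter(probs.values())                  # how many walks per probability
second_moment = sum(1 / p for p in probs.values())   # E(X_2^2) = sum of 1/p(w)
print(len(probs), dict(histogram), second_moment)
```

The exhaustive enumeration is only practical for tiny values of $k$; for larger squares one would draw walks at random with the same `eligible_steps` rule instead.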

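Denote by $p(w)$ the probability to draw the walk $w \in \mathcal W_k$. Consider the random variable
$X_k = 1/p(w)$, where $w$ is a random walk of $\mathcal W_k$ drawn according to the distribution $p(\cdot)$. Clearly,

$$\mathbb E(X_k) = \sum_{w \in \mathcal W_k} p(w)\,\frac{1}{p(w)} = |\mathcal W_k|.$$
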
Hence one can estimate the number of SAWs crossing a $k \times k$ square by generating $N$ walks
$w^{(1)}, \ldots, w^{(N)}$ of $\mathcal W_k$, and computing

$$|\mathcal W_k| \simeq \frac{1}{N} \sum_{i=1}^{N} \frac{1}{p(w^{(i)})}. \tag{1}$$


By generating several thousand walks for $k = 10$, Knuth obtained

$$|\mathcal W_{10}| \simeq (1.6 \pm 0.3) \times 10^{24},$$

which is quite good compared to the now known exact value, 1,568,758,030,464,750,013,214,100
(see [11, 13]). We have reproduced Knuth's experiment, and found, with a first group of 10,000
$^1$We describe in the final section how to detect algorithmically when a new step traps the walk.

Figure 1. The 12 self-avoiding walks crossing the $2 \times 2$ square. For each of
them, we give the sequence $1/p_1, 1/p_2, \ldots$, where $p_i$ is the probability of the $i$th
step. The probability of the walk is thus the reciprocal of the product of the
terms in the list. Two walks have probability 1/8, six have probability 1/12,
and four have probability 1/16. Two walks that differ by a diagonal symmetry
have the same probability.



walks, the estimate $1.78 \times 10^{24}$, and with a second group, the estimate $1.38 \times 10^{24}$. As observed
by Knuth, the values $1/p(w^{(i)})$ vary a lot (a small sample of 10 walks gave us values ranging from
10^^ to 10^"'), and one may suspect that the variance of $X_k$ is probably much larger than $\mathbb E(X_k)^2$,
or, in other words, that the relative variance

$$\frac{\operatorname{Var}(X_k)}{\mathbb E(X_k)^2} = \frac{\mathbb E(X_k^2)}{\mathbb E(X_k)^2} - 1$$

is large. Note that

$$\mathbb E(X_k^2) = \sum_{w \in \mathcal W_k} \frac{1}{p(w)}.$$

Also, observe that the variance of the estimator (1) is $\operatorname{Var}(X_k)/N$, so that $\operatorname{Var}(X_k)$ is a measure
of the quality of this estimator.
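
As a small worked instance of these definitions, the moments of $X_2$ can be computed by hand. The computation below is only a sketch that relies on the probabilities reported for $k = 2$ in Figure 1 (two walks of probability 1/8, six of probability 1/12, four of probability 1/16); it is not taken from Knuth's experiments.

```latex
\begin{align*}
\mathbb E(X_2) &= 2\cdot\tfrac{1}{8}\cdot 8 + 6\cdot\tfrac{1}{12}\cdot 12 + 4\cdot\tfrac{1}{16}\cdot 16 = 12,\\
\mathbb E(X_2^2) &= 2\cdot 8 + 6\cdot 12 + 4\cdot 16 = 152,\\
\operatorname{Var}(X_2) &= 152 - 12^2 = 8,
\qquad
\frac{\operatorname{Var}(X_2)}{\mathbb E(X_2)^2} = \frac{8}{144} = \frac{1}{18}.
\end{align*}
```

For such a small square the relative variance is tiny; the concern raised by Knuth is about its growth with $k$.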

Knuth's observation led Bassetti and Diaconis to study a simpler algorithm, in which a step,
to be eligible, has to go North (N) or East (E) [5]. The resulting walk is called a directed walk, or
a NE-walk. Each step has probability 1/2 unless it follows the North or East side of the square,
in which case it has probability 1. Denote by $\mathcal D_k$ the set of directed walks of $\mathcal W_k$, and by $p(w)$







Figure 2. Left: A SAW crossing the 10 x 10 square. The thick steps have 
probability 1. That is, each of them is the only eligible step at the time when 
it is chosen. Right: A SAW crossing the 100 x 100 square, obtained via Knuth's 
algorithm. 
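the probability to generate the directed walk $w$ with this new algorithm. Define the random
variable $X_k = 1/p(w)$ as above. Then

$$\mathbb E(X_k) = |\mathcal D_k| = \binom{2k}{k} \sim \frac{4^k}{\sqrt{\pi k}}.$$

By the above argument, a walk that hits for the first time the North or East side of the square
at time $k + i$ has probability $1/2^{k+i}$. Since there are $2\binom{k+i-1}{i}$ such walks,

$$\mathbb E(X_k^2) = \sum_{w \in \mathcal D_k} \frac{1}{p(w)} = \sum_{i=0}^{k-1} 2\binom{k+i-1}{i}\, 2^{k+i}.$$

The corresponding generating function is

$$\sum_{k \ge 1} \mathbb E(X_k^2)\, x^k = \frac{2x}{1+2x}\left(\frac{3}{\sqrt{1-16x}} - 1\right), \tag{2}$$

and an elementary singularity analysis [10] gives

$$\mathbb E(X_k^2) \sim \frac{16^k}{3\sqrt{\pi k}},$$

which is roughly $\sqrt{k}$ times larger than

$$\mathbb E(X_k)^2 \sim \frac{16^k}{\pi k}.$$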
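
The second moment of the North-East sampler is easy to tabulate. The sketch below assumes the closed form $\sum_{i=0}^{k-1} 2\binom{k+i-1}{i} 2^{k+i}$ for $\mathbb E(X_k^2)$ and compares it with the coefficients of the series $\frac{2x}{1+2x}\bigl(\frac{3}{\sqrt{1-16x}}-1\bigr)$, expanded with exact integer arithmetic; the function names are hypothetical and chosen only for this illustration.

```python
from math import comb, pi, sqrt


def second_moment_ne(k):
    """E(X_k^2) for the North-East sampler: sum of 1/p(w) over directed walks."""
    return sum(2 * comb(k + i - 1, i) * 2 ** (k + i) for i in range(k))


def gf_coefficients(n):
    """Coefficients of x^1..x^n in 2x/(1+2x) * (3/sqrt(1-16x) - 1)."""
    # 1/sqrt(1-16x) = sum_m C(2m, m) 4^m x^m
    inner = [3 * comb(2 * m, m) * 4 ** m - (1 if m == 0 else 0) for m in range(n)]
    # 2x/(1+2x) = sum_{j>=1} 2 (-2)^(j-1) x^j, stored with index j-1
    outer = [2 * (-2) ** j for j in range(n)]
    return [sum(outer[j] * inner[k - 1 - j] for j in range(k))
            for k in range(1, n + 1)]


for k in (1, 2, 3, 10, 40):
    ratio = second_moment_ne(k) / comb(2 * k, k) ** 2   # E(X_k^2) / E(X_k)^2
    print(k, second_moment_ne(k), ratio, sqrt(pi * k) / 3)
```

The last two printed columns compare the exact ratio $\mathbb E(X_k^2)/\mathbb E(X_k)^2$ with the asymptotic estimate $\sqrt{\pi k}/3$ that follows from the two equivalents above.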



In this note, we first take one more step in the direction of the general problem by declaring
that South steps are also eligible. The resulting walks are partially directed walks, or NES-walks.
The probabilities of the 9 walks obtained when $k = 2$ are shown in Figure 3. Of course, these
probabilities are not the same as those obtained from Knuth's original algorithm.

Figure 3. The nine NES-walks crossing a $2 \times 2$ square, with the reciprocals of
the probabilities of their steps.

We will prove that one outcome of this increased generality is that the ratio between $\mathbb E(X_k^2)$
and $\mathbb E(X_k)^2$ becomes much bigger:

$$\mathbb E(X_k^2) \sim \frac{3}{2}\, 2^{k(k+1)} \quad\text{while}\quad \mathbb E(X_k)^2 = (k+1)^{2k}.$$



Since the $x/y$ symmetry is lost with these walks, and random NES-walks in a square look a
bit dull (Figure 4), it is natural to generalize the original question by enclosing the walks in a
rectangle $R$ of height $k$ and width $\ell$, and let $k$ and $\ell$ increase at different speeds. Thus, let $\mathcal P_{k,\ell}$
be the set of partially directed walks that start from the South-West corner of $R$ and end at the
North-East corner. A walk of $\mathcal P_{k,\ell}$ contains exactly $\ell$ East steps, and choosing the heights of
these steps determines the walk completely. Hence the number of walks in $\mathcal P_{k,\ell}$ is $(k+1)^\ell$.
Thus, defining the random variable $X_{k,\ell} = 1/p(w)$ as above, we have

$$\mathbb E(X_{k,\ell}) = |\mathcal P_{k,\ell}| = (k+1)^\ell. \tag{3}$$

Figure 4. Random partially directed walks crossing a square of size $k$, for
$k = 100$ and $k = 1000$.

We will prove that, if $k, \ell \to \infty$ in such a way that $\ell = o(2^k)$, then

$$\operatorname{Var}(X_{k,\ell}) \sim \mathbb E(X_{k,\ell}^2) \sim \frac{3}{2}\, 2^{(k+1)\ell},$$

so that the relative variance satisfies

$$\frac{\operatorname{Var}(X_{k,\ell})}{\mathbb E(X_{k,\ell})^2} \sim \frac{3}{2}\, \frac{2^{(k+1)\ell}}{(k+1)^{2\ell}}.$$

We actually obtain in Section 2 an explicit expression for the generating function of the numbers
$\mathbb E(X_{k,\ell}^2)$ (as was done for directed walks in (2)). In Section 3 we derive from this expression
the above asymptotic results. Let us mention that, even though sequential importance sampling
is widely used, no general bounds on the variance of the estimators are available. The present
paper, following [5], is thus one of very few examples where variances can be found.

Finally, in the last section, we go back to Knuth's sampler for general self-avoiding walks, and
prove that there exist two positive constants $\lambda$ and $\beta$ such that

$$\mathbb E(X_k)^{1/k^2} \to \lambda \quad\text{and}\quad \mathbb E(X_k^2)^{1/k^2} \to \beta.$$

The former result has actually been known since 1978 [1]. Since a variance is non-negative,
$\beta \ge \lambda^2$. Upper (and lower) bounds on $\lambda$ have been obtained in [8], based on the determination
of the numbers $\mathbb E(X_k) = |\mathcal W_k|$ for small values of $k$, and of related numbers counting other
configurations of self-avoiding walks. A similar study, performed for the numbers $\mathbb E(X_k^2)$, might
suffice to prove that $\beta > \lambda^2$. We conclude with a few remarks and questions on the importance
sampling of self-avoiding walks not confined to a box.

2. Exact results for NES-walks

In this section, we first describe the probability $p(w)$ to obtain the walk $w$ in terms of the
geometry of $w$ (Section 2.1). This description reduces the determination of the numbers $\mathbb E(X_{k,\ell}^2)$ to
the enumeration of NES-walks according to several parameters, which we perform in Section 2.2.

2.1. The probability $p(w)$

Let $w_0$ be a walk of $\mathcal P_{k,\ell}$, written as a sequence of N, E and S steps. Let $w$ be the prefix of
$w_0$ that precedes the last E step. That is, $w_0 = wEN\cdots N$. By convention, $w_0$ starts at height 0.

Lemma 1. The probability $p(w_0)$ to obtain $w_0$ via the importance sampling algorithm satisfies

$$\frac{1}{p(w_0)} = 2 \cdot 3^{h(w)}\, 2^{h_c(w)}\, 2^{v(w)}\, 1^{v_c(w)},$$

where

• $h(w)$ is the number of horizontal steps of $w$ that lie neither at height 0 nor at height $k$,

• $h_c(w)$ is the number of horizontal contacts of $w$, that is, horizontal steps that lie at height
0 or $k$,

• $v(w)$ is the number of vertical steps of $w$ that end neither at height 0 nor at height $k$,

• $v_c(w)$ is the number of vertical contacts of $w$, that is, vertical steps that end at height
0 or $k$.

Proof. Assume the walk $w_0$ has length $n$ and ends with exactly $j$ vertical steps. The probability
of the first step is 1/2, and the probability of each of the $j$ final steps is 1. Let $s_i$ denote the $i$th
step. Hence $w$ consists of the steps $s_1, \ldots, s_{n-j-1}$. For $1 \le i < n - j$, the probability of $s_{i+1}$
depends on the direction and position of $s_i$:

• if $s_i$ is horizontal, but not a contact, then the probability of $s_{i+1}$ is 1/3,

• if $s_i$ is a horizontal contact, then the probability of $s_{i+1}$ is 1/2,

• if $s_i$ is vertical, but not a contact, then the probability of $s_{i+1}$ is 1/2,

• if $s_i$ is a vertical contact, then the probability of $s_{i+1}$ is 1.

The lemma follows. ■
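
The weight given by Lemma 1 can be evaluated mechanically. The sketch below is an illustration under stated assumptions: a NES-walk of $\mathcal P_{k,\ell}$ is encoded by the tuple of heights of its $\ell$ East steps (an encoding chosen here for convenience), and `inverse_probability` and `second_moment` are hypothetical names. Summing the weights over all $(k+1)^\ell$ walks gives $\mathbb E(X_{k,\ell}^2)$.

```python
from fractions import Fraction
from itertools import product


def inverse_probability(heights, k):
    """1/p(w0) from Lemma 1 for the NES-walk whose East steps lie at `heights`."""
    h = hc = v = 0
    current = 0
    last = len(heights) - 1
    for idx, target in enumerate(heights):
        # vertical steps leading from the current height to the next East step
        step = 1 if target > current else -1
        while current != target:
            current += step
            if current not in (0, k):   # vertical contacts contribute a factor 1
                v += 1
        if idx < last:                  # the last East step is not part of the prefix w
            if current in (0, k):
                hc += 1
            else:
                h += 1
    return 2 * 3 ** h * 2 ** hc * 2 ** v


def second_moment(k, ell):
    """E(X_{k,ell}^2) as the sum of 1/p(w0) over all walks of the rectangle."""
    return sum(inverse_probability(hs, k)
               for hs in product(range(k + 1), repeat=ell))


weights = sorted(inverse_probability(hs, 2) for hs in product(range(3), repeat=2))
print(weights, second_moment(2, 2))
```

For $k = \ell = 2$ the nine weights are the products of the reciprocals shown in Figure 3, and the corresponding probabilities sum to 1, which is a convenient sanity check of the encoding.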

2.2. Enumeration of NES-walks in a strip of fixed height

Recall the expression (3) of the numbers $\mathbb E(X_{k,\ell})$. For $k$ (the height of the rectangle) fixed,
the generating function of the numbers $\mathbb E(X_{k,\ell})^2$ is rational:

$$\sum_{\ell \ge 1} \mathbb E(X_{k,\ell})^2\, x^\ell = \sum_{\ell \ge 1} (k+1)^{2\ell} x^\ell = \frac{(k+1)^2 x}{1 - (k+1)^2 x}. \tag{4}$$

We will determine the variance of $X_{k,\ell}$ by describing the generating function of the numbers
$\mathbb E(X_{k,\ell}^2)$, which is also rational when $k$ is fixed.

Proposition 2. For any fixed height $k$, the generating function $M_k(x)$ of the numbers $\mathbb E(X_{k,\ell}^2)$
is a rational series:

$$M_k(x) := \sum_{\ell \ge 1} \mathbb E(X_{k,\ell}^2)\, x^\ell = 2x\, \frac{N_k}{G_k},$$

where $N_k$ and $G_k$ are polynomials in $x$ given by the same recurrence relation:

$$N_k = (5 + 9x) N_{k-2} - 4 N_{k-4}$$

(and similarly for $G_k$), with initial conditions

$$\begin{array}{ll}
N_1 = 2, & G_1 = 1 - 4x,\\
N_2 = 5 + 3x, & G_2 = 1 - 9x - 6x^2,\\
N_3 = 11 + 9x, & G_3 = 1 - 19x - 18x^2,\\
N_4 = 23 + 54x + 27x^2, & G_4 = 1 - 36x - 99x^2 - 54x^3.
\end{array}$$

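Example. For $k = 2$,

$$\sum_{\ell \ge 1} \mathbb E(X_{2,\ell}^2)\, x^\ell = 2x\, \frac{5 + 3x}{1 - 9x - 6x^2} = 10x + 96x^2 + O(x^3).$$

Figure 3 allows one to check that the coefficient of $x^2$ is correct:

$$96 = 4 + 8 + 8 + 12 + 12 + 8 + 12 + 16 + 16.$$

The generating function of the variances is

$$\sum_{\ell \ge 1} \operatorname{Var}(X_{2,\ell})\, x^\ell = 2x\, \frac{5 + 3x}{1 - 9x - 6x^2} - \frac{9x}{1 - 9x}.$$

Observe that the radius of convergence of the first fraction is smaller than that of the second fraction. As
$\ell \to \infty$,

$$\operatorname{Var}(X_{2,\ell}) \sim \mathbb E(X_{2,\ell}^2) \sim \mu^\ell$$

(up to a multiplicative constant) with $\mu = (9 + \sqrt{105})/2 \simeq 9.62$, while $\mathbb E(X_{2,\ell})^2 = 9^\ell$. ■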
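
The rational form of Proposition 2 lends itself to a direct series expansion. The following sketch assumes the recurrence $N_k = (5+9x)N_{k-2} - 4N_{k-4}$ (and the same for $G_k$) with the four listed initial pairs; polynomials are stored as lists of integer coefficients, and `poly` and `series_Mk` are hypothetical helper names.

```python
INIT_N = {1: [2], 2: [5, 3], 3: [11, 9], 4: [23, 54, 27]}
INIT_G = {1: [1, -4], 2: [1, -9, -6], 3: [1, -19, -18], 4: [1, -36, -99, -54]}


def poly(k, init):
    """Coefficient list (constant term first) of N_k or G_k, via the recurrence."""
    if k in init:
        return init[k]
    a, b = poly(k - 2, init), poly(k - 4, init)
    out = [0] * (len(a) + 1)
    for i, c in enumerate(a):          # multiply a by (5 + 9x)
        out[i] += 5 * c
        out[i + 1] += 9 * c
    for i, c in enumerate(b):          # subtract 4 * b
        out[i] -= 4 * c
    return out


def series_Mk(k, n):
    """Coefficients of x^1..x^n of M_k(x) = 2x N_k / G_k, i.e. E(X_{k,l}^2)."""
    N, G = poly(k, INIT_N), poly(k, INIT_G)
    q = []                             # power series of N_k / G_k (G_k(0) = 1)
    for m in range(n):
        c = N[m] if m < len(N) else 0
        c -= sum(G[j] * q[m - j] for j in range(1, min(m, len(G) - 1) + 1))
        q.append(c)
    return [2 * c for c in q]


coeffs = series_Mk(2, 40)
growth = coeffs[-1] / coeffs[-2]       # should approach mu = (9 + sqrt(105))/2
variances = [m - 9 ** (l + 1) for l, m in enumerate(coeffs)]
print(coeffs[:3], growth, variances[:3])
```

The ratio of consecutive coefficients illustrates the growth rate $\mu \simeq 9.62$ of the example, and subtracting $(k+1)^{2\ell}$ gives the variances.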
We now want to prove Proposition 2. Recall that

$$\mathbb E(X_{k,\ell}^2) = \sum_{w_0 \in \mathcal P_{k,\ell}} \frac{1}{p(w_0)}.$$

The expression of $p(w_0)$ given in Lemma 1 leads us to study a purely enumerative problem. For
$k$ fixed, let $\mathcal T_k$ be the set of NES-walks $w$ that start at height 0 and are confined to the strip
$0 \le y \le k$. We wish to count these walks by the parameters $h(w)$, $h_c(w)$, $v(w)$ and $v_c(w)$. So,
let

$$T_k(x, y, a, b) = \sum_{w \in \mathcal T_k} x^{h(w)} y^{v(w)} a^{h_c(w)} b^{v_c(w)}$$

be the associated generating function. This series is easily seen to be rational (there is an
underlying finite-state automaton), and there are several ways to determine it. We present here
what we believe to be the most direct one. It relies on a recursive description of the walks of $\mathcal T_k$,
where we add at each time an E step and a sequence of vertical steps. This approach requires
to take into account an additional parameter, namely the height $f(w)$ of the final point of the
walk $w$. Hence our series finally involve 5 variables:

$$T_k(s) \equiv T_k(x, y, a, b; s) = \sum_{w \in \mathcal T_k} x^{h(w)} y^{v(w)} a^{h_c(w)} b^{v_c(w)} s^{f(w)}.$$

We will denote by $\tilde{\mathcal T}_k$ the subset of $\mathcal T_k$ formed of walks that do not end at height 0 or $k$, and by
$\tilde T_k(s) \equiv \tilde T_k(x, y, a, b; s)$ the corresponding generating function. Accordingly,

$$T_k(s) = \sum_{i=0}^{k} T_{k,i}\, s^i = T_{k,0} + \tilde T_k(s) + s^k T_{k,k},$$

where $T_{k,i}$ is the series in $x$, $y$, $a$ and $b$ counting walks of $\mathcal T_k$ ending at height $i$. By Lemma 1,

$$M_k(x) = \sum_{\ell \ge 1} \sum_{w_0 \in \mathcal P_{k,\ell}} \frac{x^\ell}{p(w_0)}
= 2x \sum_{w \in \mathcal T_k} (3x)^{h(w)} (2x)^{h_c(w)} 2^{v(w)}
= 2x\, T_k(3x, 2, 2x, 1; 1). \tag{5}$$
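Lemma 3. The series $\tilde T_k(s)$, $T_{k,0}$ and $T_{k,k}$ satisfy the following system of equations:

$$\left(1 - \frac{x}{1 - ys} - \frac{x y \bar s}{1 - y \bar s}\right) \tilde T_k(s)
= \frac{ys - (ys)^k}{1 - ys} - \frac{x (ys)^k}{1 - ys}\, \tilde T_k(\bar y) - \frac{x}{1 - y \bar s}\, \tilde T_k(y)
+ a T_{k,0}\, \frac{ys - (ys)^k}{1 - ys} + a T_{k,k}\, \frac{y s^{k-1} - y^k}{1 - y \bar s},$$

$$T_{k,0} = 1 + b x \bar y\, \tilde T_k(y) + a T_{k,0} + a b y^{k-1} T_{k,k},$$

$$T_{k,k} = b y^{k-1} + b x y^{k-1}\, \tilde T_k(\bar y) + a b y^{k-1} T_{k,0} + a T_{k,k},$$

with $\bar s := 1/s$ and $\bar y := 1/y$.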

Proof. We construct the walks of $\tilde{\mathcal T}_k$ recursively, by adding at each time a horizontal step followed
by a sequence of vertical steps.

We partition the set $\tilde{\mathcal T}_k$ into three disjoint subsets, illustrated in Figure 5.

• The first subset consists of walks with no E step. These walks consist of $i$ North steps,
with $1 \le i < k$. Their generating function is

$$\sum_{i=1}^{k-1} (ys)^i = \frac{ys - (ys)^k}{1 - ys}.$$

• The second subset consists of walks in which the last E is followed by a (possibly empty)
sequence of N steps. We denote by $i$ the height of the last E step, and distinguish the
cases $i = 0$ and $0 < i < k$. The generating function of this subset of $\tilde{\mathcal T}_k$ reads

$$a T_{k,0} \sum_{j=1}^{k-1} (ys)^j + x \sum_{i=1}^{k-1} T_{k,i}\, s^i \sum_{j=0}^{k-i-1} (ys)^j
= a T_{k,0}\, \frac{ys - (ys)^k}{1 - ys} + x\, \frac{\tilde T_k(s) - (ys)^k\, \tilde T_k(1/y)}{1 - ys}.$$

• The third subset consists of walks in which the last E step is followed by a non-empty
sequence of S steps. We denote by $i$ the height of the last E step, and distinguish the
cases $i = k$ and $0 < i < k$. The generating function of this subset of $\tilde{\mathcal T}_k$ reads

$$a s^k T_{k,k} \sum_{j=1}^{k-1} (y \bar s)^j + x \sum_{i=1}^{k-1} T_{k,i}\, s^i \sum_{j=1}^{i-1} (y \bar s)^j
= a T_{k,k}\, \frac{y s^{k-1} - y^k}{1 - y \bar s} + x\, \frac{y \bar s\, \tilde T_k(s) - \tilde T_k(y)}{1 - y \bar s}.$$

Adding the three contributions gives the series $\tilde T_k(s)$ and establishes the first equation of the
lemma.

The equations for $T_{k,0}$ and $T_{k,k}$ are obtained in a similar fashion. ■





Figure 5. Recursive construction of bounded NES-walks. 
We now solve the functional equations of Lemma 3. The key tool is the kernel method (see

e.g. mmm)- 

Proposition 4. Let $k \ge 1$. The series $T_k(1) \equiv T_k(x, y, a, b; 1)$ counting NES-walks confined to
a strip of height $k$ is

$$T_k(1) = \frac{N_k}{G_k},$$

where $N_k$ and $G_k$ are polynomials in $x$, $y$, $a$ and $b$ defined by the same recurrence relation

$$N_k = (1 - x + y^2(1 + x))\, N_{k-2} - y^2 N_{k-4}$$

(and similarly for $G_k$), with initial conditions

$$\begin{array}{ll}
N_{-1} = (1 - x - xy)(b - y)/y^2, & G_0 = (x - 1)ab/y - (x + 1)(a - 1),\\
N_0 = (b - xb + xy)/y, & G_1 = 1 - a - ab,\\
N_1 = 1 + b, & G_2 = (1 - x)(1 - a) - (x + 1)yab,\\
N_2 = 1 - x + y + by(1 + x), & G_3 = (1 - x - xy)(1 - a) - yab(x + y + xy).
\end{array}$$

Equivalently, 



Tk{l) 



Q{S)S^ + Q{S) + [l - y' 



P{S)S^ + P[S) 
where S is the only power series in x and y satisfying 

$$S + \bar S = (1 + x)\,y + (1 - x)\,\bar y,$$



S^-S 



(6) 
(7) 
with $\bar S := 1/S$, and

$$P(s) = 1 - a + aby - s(ab + y - ay) \quad\text{and}\quad Q(s) = 1 - by + (b - y)s. \tag{8}$$

The reason why we give the expressions of $N_{-1}$ and $N_0$, rather than $N_3$ and $N_4$, is that they
are more compact. The same reason explains why we give $G_0$ rather than $G_4$. It is of course
easy to compute $N_3$, $N_4$ and $G_4$, and Proposition 2 then follows at once, using (5). We hope
that using the same notation $N_k$, $G_k$ for the enumeration problem of Proposition 4 and its
specialization of Proposition 2 will not create any confusion.

Proof. First, we use the last two equations of Lemma 3 to express $\tilde T_k(y)$ and $\tilde T_k(1/y)$ as linear
combinations of $T_{k,0}$ and $T_{k,k}$. Then, in the first equation of the lemma, we replace $\tilde T_k(y)$ and
$\tilde T_k(1/y)$ by their expressions in terms of $T_{k,0}$ and $T_{k,k}$. The left-hand side is unchanged, and the
right-hand side now involves only two unknown series, namely $T_{k,0}$ and $T_{k,k}$:

$$\left(1 - \frac{x}{1 - ys} - \frac{x y \bar s}{1 - y \bar s}\right) \tilde T_k(s)
= \frac{ys}{1 - ys} + \frac{y}{b(1 - y \bar s)}
+ \left(\frac{a y s}{1 - ys} + \frac{y(a - 1)}{b(1 - y \bar s)}\right) T_{k,0}
+ s^k \left(\frac{y(a - 1)}{b(1 - ys)} + \frac{y a \bar s}{1 - y \bar s}\right) T_{k,k}. \tag{9}$$

The kernel of this equation is the coefficient of $\tilde T_k(s)$. It vanishes when $s = S$ and $s = \bar S := 1/S$,
where $S$ is defined in the proposition. Since $\tilde T_k(s)$ is a polynomial in $s$, and $S$ and $\bar S$ are Laurent
series in $x$ and $y$ with finitely many monomials with negative exponents, the series $\tilde T_k(S)$ and
$\tilde T_k(\bar S)$ are well-defined. Replacing $s$ by $S$ or $\bar S$ in the above equation cancels the left-hand side,
and hence the right-hand side. One thus obtains two linear equations between $T_{k,0}$ and $T_{k,k}$,
which involve the series $S$. Solving them gives expressions of $T_{k,0}$ and $T_{k,k}$ in terms of $S$ (the
expression of $T_{k,k}$ is given in (11) below). By setting $s = 1$ in (9), one then expresses $\tilde T_k(1)$ in
terms of $S$, and finally $T_k(1) = T_{k,0} + T_{k,k} + \tilde T_k(1)$. This gives (6).

Observe that the expression ^ is unchanged if we replace S hy S — 1/S. Since S and S are 
the two roots of ([7]), this implies that Tk{l) is a rational series in x, y, a and b. However, the 
denominator of ([5]), namely P{S)S'' + P{S), is not unchanged when 5 1— >■ 1/5*. But let us define 
the series Nk and Gk as follows: 

G2k = {PiS)S'' + P{S)S'') , 

G2k+i = ^^_y][^^s^ (P(g)5^+^+P(5)^^), (10) 

N2k = -^(Q{S)S'' + Q{S)S'' + {l-y' f'^^1 ' 
1-y^ \ S -I 

Then it is easy to check that © can be rewritten as Tk{l) — Nk/Gk- Moreover, the series Nk 
and Gk are unchanged when S t-^ 1/S', and thus are rational functions of x, y, a and b. More 
precisely, each of the sequences G2k, G2k+i, N2k and N2k+i is of the form aS'' + fiS^, where S 
and S are the two roots of ([7]). Hence each sequence satisfies the recurrence relation 

Uk = (1 - X + y^(l + x))uk-i - y^Uk-2- 

One easily determines the initial values for each sequence. This yields the description of Nk and 
Gk given in the proposition. From this description, it is clear that Nk and Gk are polynomials, 
as soon as fc > 1. ■ 
Remarks

1. The series $T_{k,i}$, counting walks ending at height $i$, are also rational, but with a denominator
that is a proper multiple of the denominator of $T_k(1) = \sum_{i=0}^{k} T_{k,i}$. For instance,

T biy'-l)iS-S)S'' 

or, in terms of polynomials. 



k 



Ft ' 

where $F_k$ is defined by the recurrence relation

$$F_k = (1 - x + (1 + x)y^2)\, F_{k-1} - y^2 F_{k-2},$$

with the initial conditions

$$F_1 = (1 - a - ab)(1 - a + ab) \quad\text{and}\quad F_2 = (1 - a + bya)\bigl((1 - x)(1 - a) - (x + 1)yab\bigr).$$

It is not hard to prove that $G_k$ is a divisor of $F_k$. The simplification that occurs in the
denominator when summing the series $T_{k,i}$ has very recently been explained combinatorially by
Bacher [1].

2. The series Tk^k was already determined in the case x = y = a ~ b = t in using the 
same approach as above. The derivation is more involved here because we keep track of four 
parameters in the enumeration, and because we are interested in 71- (1) rather than Tk^k- 

3. Asymptotic results for NES-walks

We now derive asymptotic results from the previous section. Recall that $\mathbb E(X_{k,\ell}) = (k+1)^\ell$,
so that the generating function of the numbers $\mathbb E(X_{k,\ell})^2$ is given by (4) (for $k$ fixed). The radius
of convergence of the generating function of the numbers $\mathbb E(X_{k,\ell}^2)$ turns out to be exponentially
smaller than $1/(k+1)^2$.

Proposition 5. Let k > 1. The series Mk{x), given in Proposition^ has a unique pole pk of 
modulus less than 1/9, satisfying 

_ 1 9 12fc- 23 36P - 54A: - 87/8 ^ / A:^ \ 

- ^ + ^-^ITT - 2 . 8^^+i + 16^^^ ^ [s^J- ^ ' 

The residue of ak of Mk{x) at pk satisfies 

3 9fc - 4 - 48/s + 1 Slfc^ - 306fc2 + 7bk +140 ^ ( k"^ \ 

«^-2-^feT^ + 2~¥T^ 2~8^^^ + ^(vl6^j- ^^^^ 

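The second moment of $X_{k,\ell}$ satisfies

$$\mathbb E(X_{k,\ell}^2) = \alpha_k\, \rho_k^{-\ell} + O(9^\ell k).$$

In particular, if $k, \ell \to \infty$ in such a way that $\ell = o(2^k)$, then

$$\operatorname{Var}(X_{k,\ell}) \sim \mathbb E(X_{k,\ell}^2) \sim \frac{3}{2}\, 2^{(k+1)\ell},$$

which is much larger than $\mathbb E(X_{k,\ell})^2 = (k+1)^{2\ell}$.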
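
The location of the dominant pole can be explored numerically. The sketch below is an illustration only: it builds $G_k$ from the recurrence of Proposition 2, locates its positive zero by bisection on $[0, 1]$ (assuming, as the statement above suggests, that this zero is unique and that $G_k$ changes sign there), and compares it with $2^{-(k+1)}$, the value suggested by the growth rate $2^{(k+1)\ell}$. The names `poly_G`, `evaluate` and `positive_root` are hypothetical.

```python
def poly_G(k):
    """Coefficient list (constant term first) of the denominator G_k."""
    polys = {1: [1, -4], 2: [1, -9, -6], 3: [1, -19, -18], 4: [1, -36, -99, -54]}
    for j in range(5, k + 1):
        a, b = polys[j - 2], polys[j - 4]
        out = [0] * (len(a) + 1)
        for i, c in enumerate(a):      # (5 + 9x) * G_{j-2}
            out[i] += 5 * c
            out[i + 1] += 9 * c
        for i, c in enumerate(b):      # - 4 * G_{j-4}
            out[i] -= 4 * c
        polys[j] = out
    return polys[k]


def evaluate(p, x):
    """Horner evaluation of the polynomial with coefficient list p at x."""
    value = 0.0
    for c in reversed(p):
        value = value * x + c
    return value


def positive_root(k, iterations=200):
    """Bisection for the positive zero of G_k, using G_k(0) = 1 > 0 > G_k(1)."""
    g = poly_G(k)
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if evaluate(g, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for k in (1, 2, 5, 10, 12):
    rho = positive_root(k)
    print(k, rho, rho * 2 ** (k + 1))   # the last column appears to approach 1
```

Floating-point bisection is enough for the moderate values of $k$ tried here; exact rational arithmetic would be a safer choice for much larger $k$.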

Proof. We proceed in four steps. We first express the series $M_k(x)$ in terms of an algebraic
series $S$, as was done for the enumerative problem in Proposition 4. Then, we study the analytic
properties of $S$. We use these properties to prove that the denominator $G_k$ of $M_k$ is real-rooted,
with one positive zero $\rho_k$ and all the other zeroes below $-1/9$. We finally apply Cauchy's formula
to extract the $\ell$th coefficient of $M_k(x)$, which is $\mathbb E(X_{k,\ell}^2)$.

1. The expression of Mk 

By Proposition 2,

$$M_k(x) = 2x\, \frac{N_k}{G_k},$$

where the polynomials $N_k$ and $G_k$ can be described either by induction, or, after performing the
change of variables $x \to 3x$, $a \to 2x$, $y \to 2$ and $b \to 1$ in (10):

G2k = -y (P(l/5)5'= + P(5)5-")' 

G2fe+i - -Y^(P(l/5)5^+i + P(5)5-'=), (14) 

A^2.+i = -^[Q{l/S)S^+' + Q{S)S-^-i^^—^ 
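where $S$ and $1/S$ are the two power series in $x$ satisfying

$$S + \frac{1}{S} = \frac{5 + 9x}{2}, \tag{15}$$

or equivalently,

$$x = -\frac{1}{9}\,(2S - 1)(2/S - 1). \tag{16}$$

The polynomials $P(s)$ and $Q(s)$ are

$$P(s) = 1 + 2x - 2s(1 - x) \quad\text{and}\quad Q(s) = -1 - s,$$

so that, in view of (16),

$$P(S) = \frac{(2S - 1)(2S^2 - 11S - 4)}{9S} \quad\text{and}\quad P(1/S) = \frac{(2 - S)(2 - 11S - 4S^2)}{9S^2}.$$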
2. The series $S(x)$

From now on, we denote by $S$ the root of (15) that has constant term 1/2:

$$S = \frac{5 + 9x - 3\sqrt{(1 + x)(1 + 9x)}}{4}. \tag{17}$$

Lemma 6. The series $S$ has radius of convergence 1/9, and admits an analytic continuation,
still denoted by $S$, in $\mathbb C \setminus [-1, -1/9]$. In this domain, $S$ never vanishes, and its modulus is less
than 1.

Proof. The existence of an analytic continuation follows from basic complex analysis. If $x =
u + iv$, the imaginary part of the discriminant $(1 + x)(1 + 9x)$ reads $2v(5 + 9u)$. Using the principal determination of the
square root, the analytic continuation of $S$ is given by (17) when $\Re(x) > -5/9$, and otherwise
by

$$S = \frac{5 + 9x + 3\sqrt{(1 + x)(1 + 9x)}}{4}.$$

A plot of the modulus of $S$ is shown in Figure 6 (left). ■
3. The roots of Gk 

Lemma 7. For $k \ge 1$, the denominator $G_k$ of the series $M_k$ is real-rooted. It has a unique
positive zero $\rho_k$, which, as $k \to \infty$, admits the expansion (12). The residue $\alpha_k$ of $M_k$ at $\rho_k$
admits the expansion (13). The other zeroes are smaller than $-1/9$.

We could use Rouché's theorem to prove that, for any $\varepsilon > 0$, the polynomial $G_k$ has only one
root of modulus less than $1/9 - \varepsilon$ for $k$ large enough, but the above statement is more precise.



^From now on, we carefully avoid the notation S ■.= 1/S, since we will soon be doing complex analysis. 
Proof. The case $k = 1$ being trivial ($G_1 = 1 - 4x$), we focus on the case $k \ge 2$. By Proposition 2,
the denominator $G_k$ has degree $\lceil (k+1)/2 \rceil$. The expressions (14) are symmetric in $S$ and $1/S$, and
thus hold for any determination of $S$, and thus for any $x \in \mathbb C$, including in the cut $[-1, -1/9]$.
They show that $G_k(x) = 0$ if and only if $S \ne -1$ and

$$S^{k-1} = -\frac{(2S - 1)(2S^2 - 11S - 4)}{(2 - S)(2 - 11S - 4S^2)}.$$

Conversely, if $s \in \mathbb C \setminus \{-1\}$ is a root of

$$s^{k-1} = R(s) := -\frac{(2s - 1)(2s^2 - 11s - 4)}{(2 - s)(2 - 11s - 4s^2)}, \tag{18}$$

then

$$x := -\frac{1}{9}\,(2s - 1)(2/s - 1) \tag{19}$$

is a root of $G_k$. Observe that in this case, $1/s$ is also a root of (18), and gives rise to the same
root $x$ of $G_k$. Conversely, if two distinct roots $s_0$ and $s_1$ of (18) give rise to the same root of $G_k$,
then $s_1 = 1/s_0$.

It is easy to relate the positions of $s$ and $x$ in the complex plane. By writing $s = u + iv$, one
finds that $x$ is real if and only if $s$ is real or has modulus 1. If $s = e^{i\theta}$, then $x = (4\cos\theta - 5)/9$
lies in $[-1, -1/9]$. If $s$ is real and negative, then $x \le -1$, and the equality holds if and only if
$s = -1$. If $s$ is real and positive, then $x \ge -1/9$, and $x > 0$ if and only if $s \notin [1/2, 2]$.

Since we want to prove that $G_k$ is real-rooted, let us study the roots of (18), distinct from
$-1$, that are real or have modulus 1. We will prove that (18) has

• two pairs $\{s, 1/s\}$ of real zeroes distinct from $-1$, one positive outside of $[1/2, 2]$, and
one negative,

• $\lceil (k-3)/2 \rceil$ pairs of zeroes distinct from $-1$ on the unit circle.

Consequently, $G_k$ has two real zeroes outside the interval $[-1, -1/9]$, one positive, one less than
$-1$, and $\lceil (k-3)/2 \rceil$ zeroes in $[-1, -1/9]$. In particular, it is real-rooted.

Real roots of (18). An elementary study of the function $R(s)$, for $s \in \mathbb R$, reveals that it
consists of 4 decreasing branches, shown in Figure 6 (right), with vertical asymptotes at

$$s = -\frac{11 + 3\sqrt{17}}{8} \simeq -2.9, \quad s = \frac{-11 + 3\sqrt{17}}{8} \simeq 0.17, \quad\text{and}\quad s = 2.$$

The branches intersect the $s$-axis at the reciprocals of these three values (and in particular at
1/2). Thus in $\mathbb R_+$, the equation $s^{k-1} = R(s)$ has two roots, one below 1/2 and the other beyond
2, which are necessarily the reciprocal of each other. The smallest of these increases to 1/2 as $k$
increases: thus the corresponding value of $x$ decreases to 0 as $k$ increases. We denote by $\rho_k$ this
root of $G_k$.

If $k$ is even, the equation $s^{k-1} = R(s)$ has also two roots in $\mathbb R_-$. If $k \ge 3$ is odd, the curve
$s \mapsto s^{k-1}$ intersects the second branch of $R(s)$ at $s = -1$, but also somewhere between $s = 0$ and
$s = -0.508\ldots$ (which is the root obtained for $k = 3$). The latter intersection point gives rise to
a root of $G_k$ smaller than $-1$.

The rest of the argument will show that all other roots of $G_k$ lie in $[-1, -1/9]$.

Roots of of modulus 1. We first observe that, if s has modulus 1, then the same holds 
for R{s). More precisely, if s = e'^, then R{s) = e"^ with 

56 + 321 cos e - 336 cos^ d + 128 cos^ 6 

COS (I) — " 

^ (5 - 4 cos 61) (157 + 44 COS 6*- 32 cos2 6*)' 

^"^"^ ~ ^ (5-4 cos 61) (157 + 44 cos 61-32 cos2 9) ' 

Plots of cos (j) and sin (/) as a function of 6 are shown in Figure [T] For s = e*^ , Eq. (fT5)) is 
equivalent to cos((fc — 1)9) = cos0 and sin((fc — 1)^^) = sin0. Given that 1/s = e^*^, we can 
focus on solutions such that 9 £ [0, tt]. The oscillations of cos((fc— 1)9) in this interval imply that 
the equation cos((fc — 1)9) = cos(f> admits at least one solution in each interval (^Ej"'"'' FTT'"']' 
for 1 < m < — 1. For each solution, sin((/s — 1)6*) = isint/), and the plot of sine/) in Figure [7] 
shows that sin((fc — 1)6') = sin0 if and only if sin((fc — 1)6*) < 0, that is, if m is even. We finally 
note that, when k is odd, one solution is 9 = tt, giving s = — 1, which we want to exclude. 

This discussion shows that ([T5| has at least [-^7^] solutions s 7^ — 1 with Im(s) > on the 
unit circle. They give rise to as many roots of Gk in the interval [—1,-1/9]. With the two 
real roots of Gk found previously outside this interval, this gives a total of [-^y^] roots, which 
coincides with the degree of Gk- Hence Gk is real rooted, with one positive root pk, and the 
others smaller than —1/9. 




Figure 7. Plots of cos(j) (thick curve) and cos((fc — 1)9) against 9, for 9 G 
[— 7r,7r], when fc = 10 (left) and fc = 11 (middle). Right: Plot of sm(j>. 



It remains to obtain an expansion of $\rho_k$ as $k$ grows. We first work out an expansion of the
solution of (18) found around 1/2:

1 _3_ _3_ 36 fc + 27 27-32fc"2-9-16fc-717 / fc^ \ 
- 2 ^ 2*^+2 ^ 4*^+2 ^ 4.8^=+! 16^+2 ^ 1^32^ J ' 

This translates into the expansion of $\rho_k$ using (19).

The residue $\alpha_k$ is then easily derived from $M_k = 2xN_k/G_k$ and the expressions (14) of $G_k$
and $N_k$. ■



4. Conclusion 

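By Lemma 7 and Cauchy's formula,

$$[x^\ell]\left(M_k(x) - \frac{\alpha_k}{1 - x/\rho_k}\right)
= \frac{1}{2i\pi} \oint_{\mathcal C} \left(M_k(x) - \frac{\alpha_k}{1 - x/\rho_k}\right) \frac{dx}{x^{\ell+1}}, \tag{20}$$

where $\mathcal C$ is the circle of radius 1/9 centered at the origin. We will prove that there exists a
constant $C$ such that for all $k$ and $x \in \mathcal C$,

$$\left|M_k(x) - \frac{\alpha_k}{1 - x/\rho_k}\right| \le Ck,$$

so that (20) implies

$$[x^\ell] M_k(x) = \mathbb E(X_{k,\ell}^2) = \alpha_k\, \rho_k^{-\ell} + O(9^\ell k),$$

as stated in Proposition 5.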

It follows from (fT^ and (fT51) that ._°'} — is bounded uniformly in k and x £ C,so that we only 
need to prove that Mk{x) = 0(fc), uniformly in x € C. By LemmajHl for any x € C \ {—1/9}, 
|S(a;)| < 1. Moreover, S[x) -> 1 as a: ^ -1/9. Recall that Mk{x) = 2xNk/Gk. By HH), 
Nk = 2^/'^0{k), so that it sufRces to prove that Gk/i^/"^ is bounded away from 0, uniformly in 
k and x €C. Since 1 + 5 and are uniformly bounded, and P{l/S) is bounded away from 0, 
this is equivalent to 

PiS) 



inf 

k,xec 



> 0. 



By the proof of Lemma [71 S'^ 



P{l/S) 

^^^"^ does not vanish on C. Hence it suffices to prove that 



PUTS) 

liminf inf 

k x^e 



PiS) 



Pil/S) 



> 0. 



Let us write x = -e=^*^/9, with 6 e [0, tt]. Then, as 6* ^ 0, 



Six) 

\Six)\ 
PiS) 



i-^{i^i)V9 + oie), 

l-iVe + O(0), 



20 
13 



ilTi)Ve + Oi9). 



Pil/S) 

We split then interval [0, tt], to which 6 belongs, in three parts. 
• When < 7r/(2fc), there holds, uniformly in 6, 



(21) 
(22) 
(23) 



In particular. 



Moreover, 



uniformly in 9. Hence 



Six)'' = exp (-A:(l T i)Ve/2) + 0(l/fc) 
^iS*") 



= exp i~kV0/2) cosikVe/2) + Oil/k) 
> exp(-7r/4)/A/2 + 0(l/A:). 



^ (i^iTs) ) = ^ + ^^^^^ = ^ + 



^ + i^^) ^ ^ ^ cxp(-V4)/x/2 + 0(l/fc), 
and



lim inf inf 

Ve<Tr/{2k) 



P{S) 



P{l/S) 
P{S) 



> 0. 



• Let e > be such that, for ^/O < e, 

\S{x)\<l-Ve/A and -^^^ > 0.9. 

Such an e exists in view of (gH) and For 7r/(2fc) <VO<e, 

\S\'' < (1 - Ve/4f < (1 - 7r/(8fc))^- = exp(-7r/8) + 0(l/fc), 



so that 



Hence 



gk 



P{S) 



Pil/S) 



> 0.9 - exp(-7r/8) + 0{l/k) > 0.2 + 0(l/fc). 
PiS) 



hm inf inf 

7r/(2fc)<Ve< 



> 0. 



P{l/S) 

Finally, when > e, then \S\ < 1 is bounded away from 1, uniformly in 6. Thus, if 

,k , PiS) 



lim inf inf 



gk 



= 0, 



Pil/S) 

there would exist an a; g C such that P{S{x)) = 0. But this only happens when a; = or 
X ~ (—1 ± a/17)/4, and none of these values lies on the circle C. 

This concludes the proof of Proposition [S] B 

4. Back to general self-avoiding walks

4.1. Partial results on the asymptotics of $\mathbb E(X_k)$ and $\mathbb E(X_k^2)$

Let us go back to Knuth's original algorithm, described at the beginning of the paper. Recall
that $\mathbb E(X_k)$ is the number of SAWs crossing a square of side $k$, and that $\mathbb E(X_k^2)$ is the sum of the
reciprocals of the probabilities of these walks.

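Proposition 8. Denote $c(k) = \mathbb E(X_k)$ and $d(k) = \mathbb E(X_k^2)$. There exist two positive constants $\lambda$
and $\beta$ such that

$$\mathbb E(X_k)^{1/k^2} \to \lambda \quad\text{and}\quad \mathbb E(X_k^2)^{1/k^2} \to \beta.$$

Of course, $\beta \ge \lambda^2$. Moreover,

$$\lambda = \sup_k\, c(k)^{1/(k+1)^2} \tag{24}$$

and

$$\beta = \sup_k\, \bigl(\sqrt{2}\, d(k)\bigr)^{1/(k+1)^2}. \tag{25}$$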



Proof. As can be expected, these results follow from a super-multiplicativity argument. The
existence of $\lambda$ was established for the first time in [1], and the lower bound (24) appears in [5].
We repeat the argument, because it applies almost verbatim to the numbers $d(k)$.

Define $\lambda := \limsup_k c(k)^{1/k^2}$. Then $\lambda$ is finite, because there are only a quadratic number of
edges in the $k \times k$ square, and a walk is determined by the set of its edges. Let $\varepsilon > 0$. We will
prove that

$$\liminf_K\, c(K)^{1/K^2} \ge \lambda - \varepsilon, \tag{26}$$

which implies that $\lambda$ is actually the limit of $c(k)^{1/k^2}$.

Let $k$ be such that $c(k)^{1/(k+1)^2} > \lambda - \varepsilon$. Let $K > k$, and let $n$ be maximal so that

$$(k + 1)(2n + 1) - 1 \le K.$$

This implies in particular that $K < (k + 1)(2n + 3)$. In the $K \times K$ square, put $(2n + 1)^2$ smaller
squares of side $k$, as shown in Figure 8. In each smaller square, choose a SAW that crosses it,
and build from this collection of short walks a long walk crossing the larger square, as shown in
the figure. This construction implies

$$c(K) \ge c(k)^{(2n+1)^2}.$$

Thus

$$c(K)^{1/K^2} \ge \left(c(k)^{1/(k+1)^2}\right)^{(2n+1)^2/(2n+3)^2} \ge (\lambda - \varepsilon)^{(2n+1)^2/(2n+3)^2}.$$

Taking the liminf on $K$ boils down to taking the liminf on $n$ and gives (26). The bound (24)
also follows from the above inequalities.



K 




Figure 8. Super-multiplicativity for SAWs crossing a square 



Let us now consider the numbers $d(k)$. Again, $\beta := \limsup_k d(k)^{1/k^2}$ is finite, because

$$d(k) = \sum_{w \in W_k} \frac{1}{p(w)} \le \sum_{w \in W_k} 3^{|w|},$$

where $|w|$ denotes the length of $w$. Now return to Figure 8. Denote by $w_1, w_2, \ldots$ the short
walks, and by $w$ the long one. It is clear for the sampling algorithm that



$$\frac{1}{p(w)} \ge \prod_{i=1}^{(2n+1)^2} \frac{1}{p(w_i)}.$$

It follows that

$$d(K) \ge d(k)^{(2n+1)^2},$$

from which one can prove, as above, that $\beta = \lim_k d(k)^{1/k^2}$. The above bound on $d(K)$ can
actually be improved: in every row of $(2n+1)$ small squares, except maybe the top one, $n$ of
the horizontal steps added between the small squares have probability 1/2 or 1/3. Hence

$$d(K) \ge 2^{2n^2} \, d(k)^{(2n+1)^2},$$

and the lower bound (25) now follows. ■
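For small $k$, the quantities $c(k)$ and $d(k)$ of Proposition 8 can be computed by exhaustive enumeration, which gives numerical lower bounds on the two growth constants. The following is a minimal sketch, not the method used in the paper: it assumes that eligibility of a step is tested by a breadth-first search through the unvisited vertices, and the function names (`reachable`, `count_and_second_moment`, `lower_bounds`) are hypothetical.

```python
import math
from collections import deque

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # E, W, N, S


def reachable(start, target, visited, k):
    """True if `target` can be reached from `start` through unvisited
    vertices of the (k+1) x (k+1) grid of vertices."""
    if start == target:
        return True
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt == target:
                return True
            inside = 0 <= nxt[0] <= k and 0 <= nxt[1] <= k
            if inside and nxt not in visited and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def count_and_second_moment(k):
    """Return (c(k), d(k)): the number of SAWs crossing the square of side k,
    and the sum of the reciprocals 1/p(w) of their sampling probabilities."""
    target = (k, k)
    c = d = 0

    def explore(pos, visited, inv_prob):
        nonlocal c, d
        if pos == target:
            c += 1
            d += inv_prob
            return
        eligible = []
        for dx, dy in STEPS:
            nxt = (pos[0] + dx, pos[1] + dy)
            inside = 0 <= nxt[0] <= k and 0 <= nxt[1] <= k
            if inside and nxt not in visited and reachable(nxt, target, visited, k):
                eligible.append(nxt)
        # Each eligible step is chosen with probability 1/len(eligible),
        # so 1/p(w) is multiplied by len(eligible).
        for nxt in eligible:
            visited.add(nxt)
            explore(nxt, visited, inv_prob * len(eligible))
            visited.discard(nxt)

    explore((0, 0), {(0, 0)}, 1)
    return c, d


def lower_bounds(k):
    """Lower bounds c(k)^(1/(k+1)^2) and (sqrt(2) d(k))^(1/(k+1)^2)."""
    c, d = count_and_second_moment(k)
    exponent = 1.0 / (k + 1) ** 2
    return c ** exponent, (math.sqrt(2) * d) ** exponent
```

Calling `lower_bounds(k)` for `k` up to 4 or 5 runs quickly; the enumeration grows much too fast for this brute-force approach to go far beyond that.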
4.2. Unconfined self-avoiding walks

Knuth designed his algorithm to sample SAWs crossing a square, but his approach can be
easily adapted to sample (untrapped) SAWs of prescribed length $n$: start from the origin of $\mathbb{Z}^2$,
and at each time, choose with equal probability (which can be 1/3, 1/2 or 1) one of the eligible
steps. A step is eligible if, once appended to the current walk, it gives an untrapped self-avoiding
walk, that is, a walk that can be extended into a SAW of arbitrary length. The growth constant
of untrapped SAWs is the same as for general SAWs (because all bridges are untrapped [14]).
We do not know of any more precise result. The average end-to-end distance in a uniform SAW
of length $n$ is conjectured (and strongly believed) to be of the order of $n^{3/4}$. We do not know if
a similar conjecture exists for uniform untrapped SAWs.
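The sampling procedure just described can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the paper: the untrapped test is done here by a flood fill (a candidate endpoint is accepted if it can reach a vertex outside the bounding box of the walk through unvisited vertices), which is simpler but slower than a winding-number test. The names `escapes` and `sample_untrapped_saw` are hypothetical. Note that the very first step has four eligible directions.

```python
import random
from collections import deque

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # E, W, N, S


def escapes(start, visited, box):
    """True if `start` can reach a vertex outside `box` (the bounding box of
    the walk) through unvisited vertices, i.e. if it is not enclosed."""
    xmin, xmax, ymin, ymax = box
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            return True  # outside the bounding box nothing can block the walk
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in visited and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def sample_untrapped_saw(n, rng=random):
    """Sample an untrapped SAW with n steps starting at the origin.
    Returns the list of visited vertices and the reciprocal 1/p(w)
    of the probability of the sampled walk."""
    pos = (0, 0)
    walk, visited = [pos], {pos}
    inv_prob = 1
    xmin = xmax = ymin = ymax = 0
    for _ in range(n):
        box = (xmin, xmax, ymin, ymax)
        candidates = [(pos[0] + dx, pos[1] + dy) for dx, dy in STEPS]
        # A step is eligible if it keeps the walk self-avoiding and untrapped.
        eligible = [v for v in candidates
                    if v not in visited and escapes(v, visited, box)]
        inv_prob *= len(eligible)
        pos = rng.choice(eligible)
        walk.append(pos)
        visited.add(pos)
        xmin, xmax = min(xmin, pos[0]), max(xmax, pos[0])
        ymin, ymax = min(ymin, pos[1]), max(ymax, pos[1])
    return walk, inv_prob
```

Since the current walk is always untrapped, the list of eligible steps is never empty. Each flood fill costs at most the area of the bounding box, so this sketch is only suitable for walks of moderate length.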

It would also be interesting to study the asymptotic properties of untrapped SAWs chosen
according to the non-uniform, but very natural, distribution that results from our sampling
procedure. In particular, it is likely that the average end-to-end distance will be smaller than
$n^{3/4}$ here, because "compact" walks in which few steps are eligible at each time have a higher
probability than more spread-out walks. Figure 9 shows a random SAW drawn using our recursive
algorithm and a (quasi-)uniform SAW drawn using a pivot algorithm. For unconfined
partially directed walks, however, the end-to-end distance is easily shown to be linear, both for
the uniform model and for the distribution studied in this paper. Let us mention that the
importance sampling of possibly trapped SAWs of length $n$ (with rejection when the walk gets stuck
before it reaches length $n$) has also been considered, with a conjectured end-to-end distance of
$n^{2/3}$ [H].




Figure 9. A random untrapped SAW of length 5000 obtained via importance
sampling (left), and a quasi-uniform SAW of length 20000 (right).

4.3. When does a walk get trapped? 

One important feature of Knuth's algorithm, and of its adaptation to unconfined SAWs
discussed in the previous subsection, is that one never appends a step that would trap the walk.
Since Knuth does not explain in his paper how he detects self-trapping, let us describe the
method we used.

Let $w$ be an untrapped SAW of length $n$, ending at vertex $v_n = (i, j)$, and, say, with a W
step. There are, up to obvious symmetries, exactly three situations when adding a new step to
$w$ creates a trapped walk:

• the vertex $v = (i - \ldots)$ belongs to $w$, one appends a N step to $w$ and the portion of $w$
going from $v$ to $v_n$ has winding number $-2\pi$,

• the vertex $v = (i - 1, j + 1)$ belongs to $w$, one appends a N step to $w$ and the portion of
$w$ going from $v$ to $v_n$ has winding number $-2\pi$,

• the vertex $v = (i - 1, j + 1)$ belongs to $w$, one appends a W or S step to $w$ and the
portion of $w$ going from $v$ to $v_n$ has winding number $2\pi$.
These three cases are depicted in Figure 10. When computing the winding number, we add a
half-edge pointing from the East to $v$ (Figure 11). The winding number is then the difference
between the number of left turns and the number of right turns, multiplied by $\pi/2$.
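The turn-counting rule above can be sketched in a few lines. This is an illustration only: the encoding of a portion of walk as a string of letters E, N, W, S and the function names are hypothetical, and the half-edge "pointing from the East to $v$" is interpreted here as an initial step heading West into $v$, which is an assumption about the convention.

```python
import math

# Directions listed in counterclockwise order: moving from one direction to
# the next one in this order (modulo 4) is a left turn.
DIRS = {"E": 0, "N": 1, "W": 2, "S": 3}


def quarter_turns(steps, incoming="W"):
    """Number of left turns minus number of right turns along `steps`,
    starting with a half-edge heading in direction `incoming`."""
    total = 0
    prev = DIRS[incoming]
    for s in steps:
        cur = DIRS[s]
        diff = (cur - prev) % 4
        if diff == 1:        # left turn
            total += 1
        elif diff == 3:      # right turn
            total -= 1
        elif diff == 2:      # immediate reversal: not a self-avoiding walk
            raise ValueError("a self-avoiding walk cannot reverse its last step")
        prev = cur
    return total


def winding_number(steps, incoming="W"):
    """Winding number in radians: (left turns - right turns) * pi / 2."""
    return quarter_turns(steps, incoming) * math.pi / 2
```

For instance, with the default half-edge heading West, the steps `"SENW"` make four left turns and give winding number $2\pi$, while `"NESW"` make four right turns and give $-2\pi$.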
Figure 10. How a walk gets trapped.

Figure 11. The winding number between $v$ and $v_n$ is $-2\pi$.